BlinkNet: Software-Defined Deep Learning Analytics with Bounded Resources

Koga, Brian; Vanderweide, Theresa; Zhao, Xinghui; Zhang, Xuechen (January 2021, 3rd Workshop on Accelerated Machine Learning (AccML) Co-located with the HiPEAC 2021 Conference)

Deep neural networks (DNNs) have recently achieved unprecedented success in various domains. In resource-constrained systems, QoS-aware DNNs are designed to meet the latency requirements of mission-critical deep learning applications. However, no existing DNN has been designed to satisfy latency and memory bounds simultaneously, as specified by end users of resource-constrained systems. In this paper, we propose BLINKNET, a runtime system that guarantees both latency and memory/storage bounds via efficient QoS-aware per-layer approximation. We implement BLINKNET in Apache TVM and evaluate it using the Cifar10-quick and VGG network models. Our experimental results show that BLINKNET meets the latency and memory requirements with an average accuracy loss of 2%.
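The abstract does not describe how BLINKNET chooses its per-layer approximations, so the following is only a toy illustration of the general idea of picking one approximation level per layer under a joint latency and memory bound. Everything in it is an assumption made for illustration: the `Variant` record, the `pick_variants` function, the made-up cost numbers, the exhaustive search, and the simplification that latency, memory, and accuracy loss add up across layers. It does not use Apache TVM and should not be read as the authors' algorithm.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Variant:
    """One hypothetical approximation level of a layer (all numbers are made up)."""
    name: str
    latency_ms: float
    memory_mb: float
    accuracy_loss: float  # assumed estimate, in percentage points


def pick_variants(
    layers: Sequence[Sequence[Variant]],
    latency_bound_ms: float,
    memory_bound_mb: float,
) -> Optional[Tuple[Variant, ...]]:
    """Return the per-layer choice with the lowest summed accuracy loss that
    fits both bounds, or None if no combination fits.

    Exhaustive search: acceptable for a toy example, exponential in layer count.
    """
    best = None
    best_loss = float("inf")
    for combo in product(*layers):
        # Assumption: per-layer costs simply add up across the network.
        latency = sum(v.latency_ms for v in combo)
        memory = sum(v.memory_mb for v in combo)
        if latency > latency_bound_ms or memory > memory_bound_mb:
            continue  # violates at least one user-specified bound
        loss = sum(v.accuracy_loss for v in combo)
        if loss < best_loss:
            best, best_loss = combo, loss
    return best


# Three hypothetical layers, each with an exact and an approximated variant.
LAYERS = [
    [Variant("conv1/full", 4.0, 2.0, 0.0), Variant("conv1/approx", 2.0, 1.0, 0.5)],
    [Variant("conv2/full", 6.0, 4.0, 0.0), Variant("conv2/approx", 3.0, 2.0, 1.0)],
    [Variant("fc/full", 2.0, 8.0, 0.0), Variant("fc/approx", 1.0, 3.0, 0.5)],
]

choice = pick_variants(LAYERS, latency_bound_ms=10.0, memory_bound_mb=10.0)
print([v.name for v in choice] if choice else "no feasible configuration")
```

With these invented numbers, the unapproximated network exceeds both bounds, and the search keeps `conv2` exact while approximating the other two layers. A practical runtime would need measured per-layer profiles and a search that scales beyond a handful of layers.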
